8383631: Test vmTestbase/nsk/jdb/kill/kill003/kill003.java failed - possible NullPointerException#31048
8383631: Test vmTestbase/nsk/jdb/kill/kill003/kill003.java failed - possible NullPointerException#31048plummercj wants to merge 3 commits intoopenjdk:masterfrom
Conversation
|
👋 Welcome back cjplummer! A progress list of the required criteria for merging this PR into |
|
@plummercj This change is no longer ready for integration - check the PR body for details. |
|
@plummercj The following label will be automatically applied to this pull request:
When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command. |
Webrevs
|
|
test |
|
Unfortunately this fix does not work with virtual threads because one of the prompts ends up being old-m-a-i-n[1] instead of main[1]. I'm going to need a different fix. I was working towards one yesterday which I abandoned after I found the solution in the PR. I'll see if I can get that fix working. |
|
Please re-review. I had to make some pretty significant changes to the test to get the prompt issue resolved, and to fix the newly discovered problem that the test was not doing its job when run with virtual threads enable (using the virtual thread MainWrapper support). The initial fixed pushed for this PR is still in place, and is dealing with the fact that the test JdbTest support got confused about the "String[3]" that appeared in the output and thought it was a prompt. As pointed out this resulted in the test no longer recognize the old-m-a-i-n[1] prompt you get with virtual threads. This is because the original fix limits prompt recognition to just main[1]. When I looked for a solution for the old-m-a-i-n[1] prompt, I realized that with virtual threads the test was not even doing what it is suppose to. It is suppose to issue the "locals" command at the point where the implicit exception handler for the synchronized block rethrows the exception, so it should be at the athrow in the handler. The test was counting on jdb's default handling of uncaught exceptions to notify jdb when this athrow is done, so the test can then issue the "locals" command at this point. However, with virtual threads there is an exception handler installed in the thread run() method, so there is no uncaught exception notification. So we didn't actually end up in jdb again until we were back in the MainWrapper.main() rethrowing the exception, and this is the point where the locals command was being issued, which is way too late. I did a two things to fix this. First I added an exception handler to kill003.main(). This prevents the exception from escaping to MainWrapper.main(), which prevents ever having the old-m-a-i-n[1] prompt. This solved the unrecognized old-m-a-i-n[1] prompt issue that was introduced by the first fix. The other thing I did is have the test setup jdb to notify about all exceptions being thrown. This way jdb is always notified of the athrow we care about, and we are not relying on the exception being uncaught to get the notification (something jdb is setting up by default, not the test). Because of this 2nd change, an extra "cont" is needed in the test. It is done when the NPE is thrown the first time (as part of the "kill" command). Previously this was not causing jdb to be notified (and stop), but now it does, so we need to continue. The purpose of this test is to expose the hotspot assert that was fixed by JDK-8074292. I reverted that fix and verified that this test still triggers the assert fixed by JDK-8074292, and does so both with and without virtual threads. I also verified that without the updated changes to the test, it previously was not triggering the assert when using virtual threads, so this was definitely a bug and is now fixed. I also tested with tier1 CI and also svc CI testing for tier2 and tier5, which both run the nsk/jdb tests. |
The test sees the following output and is supposed to detect the "main[1]" prompt to indicate it is done with the "locals" command that was issued, and then issue a "cont" command:
[9:13:15.40] Sending command: locals
[9:13:15.560] reply[0]: Method arguments:
[9:13:15.561] reply[1]: args = instance of java.lang.String[3] (id=669)
[9:13:15.561] reply[2]: Local variables:
[9:13:15.561] reply[3]: main[1]
[9:13:15.561] Sending command: cont
However, the output instead looks like this:
[21:15:18.114] Sending command: locals
[21:15:18.515] reply[0]: Method arguments:
[21:15:18.515] reply[1]: args = instance of java.lang.String[3] (id=686)
[21:15:18.515] reply[2]: Local variables:
[21:15:18.515] Sending command: cont
[21:15:18.716] reply[0]: main[1] >
The JdbTest.findPrompt() code looks for a pattern of characters, followed by '[', then a number, then ']'. Unfortunately it matches the String[3] text you see in the output. Because of that the test thought the "locals" command had completed, and issued the "cont" command too soon, which gets the test out of sync.
Apparently some tests have had this same issue before and a solution was already available. You just need to set compoundPromptIdent to the prompt that the test expects (sans the square brackets part).
I also fixed a couple of comment typos I noticed in JdbTest while debugging this.
Tested by running kill003 a couple hundred times on the failing platform and with the failing JVM args.
Progress
Issue
Reviewers
Reviewing
Using
gitCheckout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/31048/head:pull/31048$ git checkout pull/31048Update a local copy of the PR:
$ git checkout pull/31048$ git pull https://git.openjdk.org/jdk.git pull/31048/headUsing Skara CLI tools
Checkout this PR locally:
$ git pr checkout 31048View PR using the GUI difftool:
$ git pr show -t 31048Using diff file
Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/31048.diff
Using Webrev
Link to Webrev Comment